Faster Population Counts Using AVX2 Instructions

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Faster Population Counts Using AVX2 Instructions

Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g., popcnt on x64 processors). Maybe surprisingly, we show that a vectorized approach using SIMD instructions can be twice as fast as using the dedicated inst...

متن کامل

Faster Base64 Encoding and Decoding Using AVX2 Instructions

Web developers use base64 formats to include images, fonts, sounds and other resources directly inside HTML, JavaScript, JSON and XML files. We estimate that billions of base64 messages are decoded every day. We are motivated to improve the efficiency of base64 encoding and decoding. Compared to state-of-the-art implementations, we multiply the speeds of both the encoding (≈ 10×) and the decodi...

متن کامل

Faster Incoherent Ray Traversal Using 8-Wide AVX Instructions

Efficiently tracing randomly distributed rays is a highly challenging problem on wide-SIMD processors. The MBVH (multi bounding volume hierarchy) is an acceleration structure specifically designed for incoherent ray tracing on processors with explicit SIMD architectures like the CPU. Existing MBVH traversal methods for CPUs target 4-wide SIMD architectures using the SSE instruction set. Recentl...

متن کامل

Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions

Set intersection is one of the most important operations for many applications such as Web search engines or database management systems. This paper describes our new algorithm to efficiently find set intersections with sorted arrays on modern processors with SIMD instructions and high branch misprediction penalties. Our algorithm efficiently exploits SIMD instructions and can drastically reduc...

متن کامل

Technical Report Improvement of Fitch function for Maximum Parsimony in Phylogenetic Reconstruction with Intel AVX2 assembler instructions

The Maximum Parsimony problem aims at reconstructing a phylogenetic tree from DNA, RNA or protein sequences while minimizing the number of evolutionary changes. Much work has been devoted by the Computer Science community to solve this NP-complete problem and many techniques have been used or designed in order to decrease the computation time necessary to obtain an acceptable solution. In this ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: The Computer Journal

سال: 2017

ISSN: 0010-4620,1460-2067

DOI: 10.1093/comjnl/bxx046